I. INTRODUCTION


A. AUTOMATION OF A SUPPLY SYSTEM

Out of a need for a data automation system which would
handle stock points and inventory control points grew a
need for a plan to bring together a number of information
systems centered around stock point and inventory control
point applications. The Stock Point Logistics Integrated
Communications Environment (SPLICE) concept is that plan.
This concept involves the distribution of a number of local
area networks (LANs) which communicate via the Defense Data
Network (DDN). This thesis will take a brief look at some
aspects of the plan. But before examining the SPLICE
concept, we will compare Local Area Networks and Long
Distance Networks.

B. LOCAL NETWORKS

A local area network is a data communications system
which allows a number of independent devices to communicate
with each other, including computers, terminals, mass storage
devices, printers, plotters and copying machines. A local
network supports a wide variety of applications such as file
editing and transfer, graphics, word processing, electronic
mail, database management and digital voice [Ref. 1]. Each
LAN in SPLICE will be uniquely configured and may include
some or all of the above components. The question to ask
concerning local area networks is "What are the characteristics
which make up a local area network?"

According to A. S. Tanenbaum, reference 2, local
networks have three distinct characteristics:

1. A diameter of not more than a few kilometers

2. A total data rate exceeding 1 Mbps

3. Ownership by a single organization

Long distance networks, on the other hand, are networks like
DDN. A long distance network is usually owned by a communications
carrier and is operated as a public utility for its
subscribers, providing services such as voice, data and
video [Ref. 1]. The way a communications system will allow
effective message exchange between different communities of
users within each local area network is beyond the scope of
this thesis.

C. SPLICE AND ITS RELATIONSHIP WITH UADPS-SP

When SPLICE is examined, we see that it is designed to
augment the existing Navy stock point and inventory control
point ADP facilities which support the Uniform Automated
Data Processing System-Stock Points (UADPS-SP) [Ref. 3].
This system was one of the first attempts at standardizing
distributed logistical information. The evolution of
UADPS-SP will be traced in the following sections.

1. Origin of UADPS-SP

The original concept of the distributed processing
of supply transactions, along with the maintenance of stock
records, was first tested at NSC Norfolk in 1956. Upon the
successful completion of tests, a number of computers of
various sizes and models were installed at a few NSCs
(Oakland, Bayonne, San Diego), at NSD Newport and at NSY
Charleston in 1957 and 1958. Prompted by a push for
standardization of DOD logistics management systems, in February
of 1961, the Bureau of Supplies and Accounts (presently
NAVSUP) established a full-time committee to standardize
procedures as well as equipment at major stock points. The
IBM 1410 computer was selected for Stock Point UADPS in
January of 1962. Upon completion of an ADP programming
training course, each participating stock point was assigned
the task of analyzing and programming a particular
application. Figure 1.1 lists the initial applications along with


HISTORY OF STOCK POINT UADPS

     Application                           Activity

A    Requisition History and Status        NSC Bayonne
B    Receipts/Dues                         NSD Newport
C    Demand Processing                     NSC Norfolk
D    Inventory Control File Maintenance    NSC Oakland
E    Financial Inventory Control           NSCs San Diego and Oakland
F    Stores Accounting                     NSC Pearl Harbor
G    Cost Accounting                       NSC San Diego
K    Payroll                               NSC Pearl Harbor (later
                                           changed to NSY Long Beach)

Figure 1.1 Initial Applications.

the activity they were assigned to in 1962. To this list, a
number of other applications have been added to date. Of
course, this was only the beginning of a system which has
grown and evolved over the years. There have been numerous
Today,
Burroughs medium sized (B-3500/3700/4700/4800) systems.
Presently, there are twenty new application systems being
developed which require considerable interactive and
telecommunication support. The current UADPS-SP cannot support
these requirements without a total redesign effort and will
probably require future replacement of the current
mainframes [Ref. 3]. Nevertheless, as the Navy Supply System
evolves to meet the changing fleet needs, FMSO will adapt
and adjust UADPS-SP to meet these emerging requirements
[Ref. 4].

2. SPLICE, a computer network in support of UADPS-SP

Returning to the SPLICE concept, it has been decided
that the Burroughs computers will provide background
processing functions for large file processing and report
generation [Ref. 3]. These are the same computers used in
the UADPS-SP system.

According to reference 3,

SPLICE will be developed, however, using a standard set
[...] software. This standardization is particularly
important because SPLICE will be implemented at some
sixty different geographical locations, each having a
different mix of application and [...]. Additionally,
each LAN must have [...] communicating with other LANs
via the Defense Data Network (DDN), which [...] the
Defense Communications Agency ([...]
A layout of the local area network (LAN) can be
seen in figure 1.3. Figure 1.4, on the other hand, shows
the logical network concept. Each local area network will
include the following software modules:

1. Local Communications (LC)

2. National Communications (NC)

3. Front-End Processing (FEP)

4. Terminal Management (TM)

5. Data Base Management (DBM)

6. Session Services (SS)

7. Peripheral Management (PM)

8. Resource Allocation (RA)

The above modules will be divided into those modules which
perform operating functions and those which support the
effective use of these modules on the local area network.

D. STANDARDIZATION OF SPLICE BY DOD

One of the objectives of DOD in SPLICE has been that of
standardization. Independent development of local area
networks would cause problems. The major problem would be
unnecessary duplication of effort and continued production
of unique hardware and software. A standard system, on the
other hand, would be more economical to design, develop,
maintain and operate. For a project the size of SPLICE,
standardization is the only wise choice.

E. FUNCTIONS OF THE DATABASE MANAGEMENT MODULE

As a result of ongoing research in the implementation of
SPLICE, this thesis will address those issues involved with
the design of the Data Base Management Module of the local
area networks. The functions performed by the Database
Management Module as outlined in reference 3 will be the
following:

1. File creation

2. File update

3. Query processing and data retrieval

4. Data dictionary creation and maintenance

5. File catalog creation and maintenance
Figure 1.5 gives an outline of a back-end database
management system taken from [Ref. 3]. The implementation
of SPLICE for the Database Management Module has been
discussed in reference 3 and the conceptual employment of it
according to reference 5 is:

"The concept employed in the recommended
implementation of the database and Terminal
Management Resource requirements for SPLICE
center around a highly decentralized and
loosely coupled distributed local area
network (LAN)."

The  processors  for  each  software  module  within  each  LAN  will 
HISTORY OF STOCK POINT UADPS


The  following  activities  have  implemented  baseline  UADPS-SP  or  significant 
subsystem/application  segments: 


UADPS-SP  ACTIVITIES 


MCAS  Cherry  Point 

MCAS  El  Toro 

MCAS  Yuma 

NAF  Washington 

NAS  Alameda 

NAS  Atlanta 

NAS  Barbers  Point 

NAS  Cecil  Field 

NAS  Corpus  Christi 

NAS  Glenview 

NAS  Jacksonville 

NAS  Lemoore 

NAS  Memphis 

NAS  Miramar 

NAS  Moffett  Field 

NAS  New  Orleans 

NAS  Norfolk 

NAS  North  Island 

NAS  Patuxent  River 

NAS  Pensacola 

NAS  Point  Mugu 

NAS  South  Weymouth 

NAS  Whidbey  Island 

NAS  Willow  Grove 

NSC  Bayonne  (Disestablished) 


NSC  Charleston 

NSC  Long  Beach  (Disestablished) 

NSC  Norfolk 

NSC  Oakland 

NSC  Pearl  Harbor 

NSC  Puget  Sound 

NSC  San  Diego 

NSD  Guam 

NSD  Newport  (Disestablished) 

NSD  Subic  Bay 
NSY  Norfolk 
NSY  Philadelphia 
NSY  Portsmouth 
NAVSUBASE  Bangor 
NAVSUBASE  New  London 
NAVSUBASE  Pearl  Harbor 
ASO  Philadelphia 
NPFC  Philadelphia 
NARDAF  Newport 
NAVMTO  Norfolk 
NAVRESUPPOFC  New  Orleans 
PMOLANT  Charleston 
PMOPAC  Bremerton 
SPCC  Mechanicsburg 
SWFPAC  Silverdale  WA 


Figure  1.2  Baseline  UADPS-SP  Application  Segments. 
Figure 1.3 Local Area Network Layout.

Figure 1.5 Outline of a Back-end Database Management System
II. THE CONCEPTUAL DATABASE SCHEME


A. DATABASES PRESENTLY A PART OF SPLICE

The present databases of projects under the umbrella of
the SPLICE project vary greatly. None of the databases are
standard. It is one of the purposes of this thesis to
propose a new conceptual design of a database in the SPLICE
project which will help standardize database operations for
all sites involved with this project. Since each LAN will
have a database management module, standardizing databases
will allow users to query databases more easily from remote
sites. All queries could be standardized as well. This
chapter will discuss the conceptual design of such a
database.

B. DEFINITION OF WHAT A CONCEPTUAL VIEW ENTAILS

The conceptual view is a representation of the entire
information content of the database, in a form that is
somewhat abstract in comparison with the way in which the
data is physically stored...the conceptual view consists of
multiple occurrences of multiple types of a conceptual
record...the conceptual view is defined by means of the
conceptual schema, which includes definitions of each of the
various types of conceptual record [Ref. 6]. This means
that the conceptual view of a database shows the overall
content of the database. The conceptual schema defines that
view.
1. Definition of SPLICE database

The database with which we are concerned is a database
which contains information about parts. These parts are
parts for ships, airplanes, etc. Therefore we can assume
that basically this database will be a system which
inventories parts. In a database of this type certain information
is important:

1. Stock number or Manufacturer's number

2. Name of the manufacturer, if it applies
to this item

3. The cost of each item

4. The quantity of items available

5. The location of the item (the Activity)

6. A brief description of that item.

This is the minimum amount of information which is required
for an inventory system.

2.  Approaches  used  to  Represent  a  Database 

The next thing to decide is the kind of approach to
be used to represent the data in the database. The best
known approaches are relational, hierarchical, and network.
The approach proposed in this thesis will be the relational
approach. The relational approach to data is based on the
realization that files that obey certain constraints may be
considered as mathematical relations, and hence that
elementary theory about relations may be brought to bear on
various practical problems of dealing with data in such
files [Ref. 7]. Notice the relations given in figure 2.1.
These table-like structures are called relations. The rows
of such tables or relations are called "tuples" and the
columns are usually called "attributes". One concept that
relational theory emphasizes, and for which there does not
seem to be an established data processing term, is the
concept of the domain. A domain is a pool of values from
which the actual values appearing in a given column are
drawn [Ref. 7].

3. Proposed Conceptual Database Design

In figure 2.1 notice the relations of which the database
is composed. There are four relations to be considered. The
first of these is the Stock-Part relation, which includes:

1. Stock Number (Stock-Num) - This number
is the Federal Stock Number, normally a
thirteen digit number, which is assigned
to all stock parts. The stock numbers
could be listed in a user's manual which
could be placed on secondary storage or
online. When the stock number list is
updated, an updated version of the user's
manual could be printed.

(Note: The format of the Federal Stock
Number is given below:

1. Digits 1-4    Federal Supply Classification

2. Digits 5-6    National Coding Bureau Number

3. Digits 7-13   National Item ID Number

Additionally, digits 14-15 are
used for Weapon Systems and Aviation
parts.)

2. Manufacturer Number (Mf-Num) - This is
assigned to the part by the manufacturer.
Since there is no consistent way of
numbering parts by manufacturers, it would
be best if we did not use their numbering
scheme to inventory parts. There could
also be a duplication of manufacturer's
numbers because of the inconsistencies
caused by the lack of standards used by
manufacturers. The use of Federal Stock
Numbers would eliminate this problem.

3. Manufacturer's Name (Mf-Name) - The
manufacturer's name is given. Some portion
of it could be abbreviated.

4. Part Name (Part-Name) - This gives the
general category of a part, e.g. rudder.

5. Quantity (Quantity) - This is the number
of parts on hand at that particular time.

6. Cost (Cost) - This is the cost of each item.

7. Details (Details) - This will give more
details than the part name. Information
such as the dimensions of the part (size,
length, etc.) is given. Differences
in dimensions will cause the stock number
to change, e.g. a 1/2 inch screw has a
different stock number than a 1/4 inch
screw.

8. Reorder Point (Reord-Pt) - This is the
point at which the inventory is replenished
for this part. When quantity gets below this
point the Vendor's list must be consulted
to reorder stock parts (True for Government
equipment).

9. Weight (Wt) - This is the attribute which
gives the weight of the part in terms of
pounds, i.e. pounds/part.

10. Total Weight (Tot-Wt) - This attribute
gives the total weight of all parts with
this stock number.

Note: The key is Stock-Num.
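
The stock number format given in the note above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the names `StockNumber` and `parse_stock_number` are hypothetical, the input is assumed to be a string of 13 or 15 digits (hyphens and spaces are tolerated), and digits 14-15 are assumed to be numeric.

```python
from typing import NamedTuple, Optional


class StockNumber(NamedTuple):
    """Fields of a stock number, following the digit ranges in the note."""
    supply_class: str        # digits 1-4
    coding_bureau: str       # digits 5-6
    item_id: str             # digits 7-13
    suffix: Optional[str]    # digits 14-15, only when present


def parse_stock_number(raw: str) -> StockNumber:
    """Split a stock number string into its component fields."""
    # Assumption: separators such as hyphens or spaces may be present.
    digits = raw.replace("-", "").replace(" ", "")
    if len(digits) not in (13, 15) or not digits.isdigit():
        raise ValueError(f"expected 13 or 15 digits, got {raw!r}")
    suffix = digits[13:15] if len(digits) == 15 else None
    return StockNumber(digits[0:4], digits[4:6], digits[6:13], suffix)
```

A call such as `parse_stock_number("5305-00-1234567")` would yield the classification `5305`, the bureau number `00` and the item number `1234567`; the sample number itself is made up.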


The second relation is Local-Net. This relation provides
fast access to information indicating the sites where stock
parts may be found. The attributes included in this
relation are:

1. Stock Number (Stock-Num) - Same as stock
number attribute in the Stock-Part relation.

2. Database I.D. (DB-Id) - This is the number
which will be assigned to each database.

3. Site Number (Site-Num) - This is the number
which will be assigned to each site within
the SPLICE system.

4. Where (Where) - The location within the
SPLICE site of a particular part, e.g.
Charleston.

Note: The key is Stock-Num.

The third relation is Vendor-List. It is a list of all
vendors who service this LAN. The attributes are as follows:

1. Stock Number (Stock-Num) - Same as stock
number given in prior relations.

2. Quality Vendor List (QVL) - This is the
list of vendors that government agencies
are allowed to procure parts from. This
list is predefined and could be placed on
secondary storage. When a list is needed
it could be printed at that time.

3. Bid 1 (Bid1) - Gives the name of the vendor
who bid on the parts contract. His bid
is also included.

4. Bid 2 (Bid2) - Same as Bid1.

5. Bid 3 (Bid3) - Also the same as Bid1.

6. Bid Evaluation (Bid-Eval) - This attribute
lists the vendor who won the bid.

7. Purchase Order (Purch-Ord) - This attribute
gives the purchase order number. If this
attribute is known, we can collect
historical information or data concerning
orders.

8. Lead Time (Lead-Time) - This attribute
gives the amount of time which is needed
between the time the order is placed and
the time the shipment is delivered.

Note: The key is Purch-Ord.

The final relation is Location-Mf. This relation contains
information concerning the manufacturer of the part and
includes the following attributes:

1. Manufacturer Number (Mf-Num) - Same as
previously described.

2. Manufacturer Location (Mf-Loc) - This
attribute gives the city the manufacturer
is located in.

3. Address (Address) - This attribute gives
the mailing address of the manufacturer.

4. State (State) - This attribute gives the
state the manufacturer is located in.

5. Zip (Zip) - This attribute gives the zip
code in the manufacturer's address.

6. Phone Number (Phone-Num) - This attribute
gives the phone number, which includes
the area code, of the manufacturer's
representative or salesperson.

7. Salesperson (Salesperson) - This attribute
gives the name of the person who sold or
was responsible for the sale in a procurement
contract.

Note: The key is Mf-Num.
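
The four relations described above could be declared, for illustration, with Python's standard `sqlite3` module. This is a minimal sketch and not part of the SPLICE design: the column types, the lower-case column names, the name `where_loc` (used because WHERE is a reserved word in SQL) and the function `create_database` are all assumptions made for the example. The keys follow the notes given with each relation.

```python
import sqlite3

# Assumed column types; the thesis names the attributes but not their types.
SCHEMA = """
CREATE TABLE stock_part (
    stock_num TEXT PRIMARY KEY,
    mf_num TEXT, mf_name TEXT, part_name TEXT,
    quantity INTEGER, cost REAL, details TEXT,
    reord_pt INTEGER, wt REAL, tot_wt REAL
);
CREATE TABLE local_net (
    stock_num TEXT PRIMARY KEY,
    db_id INTEGER, site_num INTEGER, where_loc TEXT
);
CREATE TABLE vendor_list (
    purch_ord TEXT PRIMARY KEY,
    stock_num TEXT, qvl TEXT,
    bid1 TEXT, bid2 TEXT, bid3 TEXT,
    bid_eval TEXT, lead_time INTEGER
);
CREATE TABLE location_mf (
    mf_num TEXT PRIMARY KEY,
    mf_loc TEXT, address TEXT, state TEXT, zip TEXT,
    phone_num TEXT, salesperson TEXT
);
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    """Create an empty database holding the four relations."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Declaring the key of each relation as a primary key lets the database reject a second tuple with the same key value.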

Thus, we have a relational database. Its data are not only
informative but also historical in nature. With purchase
order numbers, lead time and manufacturer information, it is
possible to determine when parts were delivered. The
purchase order number could be placed in the attribute
details along with its size, etc. The local network
relations could be used to help update parts within the SPLICE
system.

Relation Stock-Part has an attribute called weight.
This attribute is the weight of a part in pounds/part. When
used in conjunction with the total weight attribute, a quick
check can be made on the number (quantity) of available
parts with this stock number, i.e. Number of parts = (Total
Weight) / (Weight in pounds/part). If this number is not the
same as the attribute quantity, there is a possible theft or
missing part.
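
The weight check described above can be sketched in Python as follows. The function names are hypothetical, and rounding the quotient to the nearest whole part is an assumption made to absorb small weighing errors.

```python
def quantity_from_weight(total_weight: float, unit_weight: float) -> int:
    """Number of parts = (Total Weight) / (Weight in pounds/part)."""
    if unit_weight <= 0:
        raise ValueError("unit weight must be positive")
    # Assumption: round to the nearest whole part.
    return round(total_weight / unit_weight)


def quantity_discrepancy(quantity: int, wt: float, tot_wt: float) -> int:
    """Recorded quantity minus the quantity implied by the weights.

    A nonzero result points to a possible theft or missing part.
    """
    return quantity - quantity_from_weight(tot_wt, wt)
```

For a part weighing 0.5 pounds with a recorded quantity of 100, a total weight of 49 pounds implies 98 parts, a discrepancy of 2.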

A number of queries may be performed on this database.
The table-like structures of relational databases make
the results of operations performed on a database easy to
understand. The operations included in this thesis are:

1.  Selections 

2.  Projections 

3.  Joins 

A selection is an operation which asks for those
tuples in a certain relation that meet a certain criterion.
For example, using the Vendor-List relation, the name and
bid of the vendor who made the first bid on a particular
purchase order could be selected.

A projection is an operation which takes a relation,
removes some of the attributes, and rearranges some of the
remaining attributes, if necessary. For example, using the
Local-Net relation, if the Database ID and Site Number were
needed to answer a query, the information could be printed
giving the Site Number first followed by the Database ID, as
opposed to the way it presently appears in the relational
table. Therefore data could be formatted in any manner the
user wanted.
Finally, there is the join operation. A join is an
operation which combines data of two or more relations. For
example, using the Stock-Part and Location-Mf relations,
the addresses of manufacturers with a particular stock
number could be found using the join operation in
conjunction with the selection operation. All of the above
operations can be used to answer a query for the user.
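
The three operations can be illustrated with SQL issued through Python's `sqlite3` module. This is a sketch only: the cut-down tables, the lower-case column names and every data value below are made up for the example, and SQL is used merely as one convenient way to express the relational operations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock_part (stock_num TEXT PRIMARY KEY, mf_num TEXT, part_name TEXT);
CREATE TABLE location_mf (mf_num TEXT PRIMARY KEY, address TEXT, state TEXT);
CREATE TABLE local_net (stock_num TEXT PRIMARY KEY, db_id INTEGER, site_num INTEGER);
CREATE TABLE vendor_list (purch_ord TEXT PRIMARY KEY, stock_num TEXT, bid1 TEXT);
INSERT INTO stock_part VALUES ('5305001234567', 'M100', 'screw');
INSERT INTO location_mf VALUES ('M100', '1 Example Street', 'VA');
INSERT INTO local_net VALUES ('5305001234567', 7, 3);
INSERT INTO vendor_list VALUES ('PO-0001', '5305001234567', 'Example Vendor: 250.00');
""")

# Selection: keep only the tuples that meet a criterion.
selection = conn.execute(
    "SELECT * FROM vendor_list WHERE purch_ord = ?", ("PO-0001",)
).fetchall()

# Projection: keep some attributes and reorder them (Site Number first).
projection = conn.execute("SELECT site_num, db_id FROM local_net").fetchall()

# Join combined with a selection: manufacturer address for one stock number.
join = conn.execute(
    """SELECT l.address
       FROM stock_part AS s
       JOIN location_mf AS l ON s.mf_num = l.mf_num
       WHERE s.stock_num = ?""",
    ("5305001234567",),
).fetchall()
```

Each result is itself a table-like list of tuples, which is what makes the outcome of these operations easy to read.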

The relations in this database are rather long, but
they are designed that way in order to avoid having many
joins performed on relations. Each relation gives as much
information as is necessary for that particular relation. If
all of the information is not needed, selections can be made
on which attributes should be projected by the user. The
user would have no need of joining numerous relations
together in order to get the information he needs.
Figure 2.1 Relational Database Tables


III. PROBLEMS IN THE LOCAL AREA NETWORK DATABASE MANAGEMENT MODULE

A. POSITIVE CHARACTERISTICS OF A DISTRIBUTED SYSTEM

The local area networks in the SPLICE system are made of
distributed systems. Some of the characteristics of a
distributed system are given below. First, minicomputers are
used in these systems. Secondly, distributed systems give
users more individualized control over processing. The
scheduling of jobs and quality of services can be determined by
the user himself. Finally, distributed systems can be more
readily tailored to organizational structures. Since no two
organizations are exactly alike, flexibility is important.

B. NEGATIVE CHARACTERISTICS OF A DISTRIBUTED SYSTEM

There are also negative characteristics of distributed
systems. First, the procedures required to implement
distributed systems are complex. Communications facilities must
be procured and computers must be connected. Data
compatibility also must be ensured. Thus, a number of problems may
occur in a distributed environment. These problems and
possible solutions will be discussed in the following
sections.

C.  PROBLEM  AREAS 

1. Access Control and Security

One of the biggest problems in a distributed system,
as in any computer system, is access control and security.
If data are online and there are multiple users who may be
able to access that data, care must be taken as to how data
are altered. It is very important that the accuracy of data
is preserved. Data can be altered in two ways:

1. Accidentally, through typing errors or
programming errors

2. Intentionally, through malicious misuse
of a database.

To prevent incorrect data from being stored in a
database or being read by unauthorized personnel, there are
two areas of concern. According to Ullman, these concerns
are:

1. Integrity preservation, and

2. Security.

Integrity preservation is guarding against nonmalicious
errors. This can be done by writing a program in such a way
that it checks for conflicting records before anything is
completed (updates, inserts, etc.). Security, or access
control as it is sometimes called, is concerned with
restricting access of users. Only those persons with a "need
to know" should have access to particular data. Therefore,
modification and alteration would only be performed by
authorized personnel. These precautions, if taken, would
allow more control over data and thus preserve its accuracy
to a greater extent. As a bare minimum, all online files
should have access control using account numbers and
accompanying passwords. These passwords would restrict each user
to the data he or she needs to get the job
done (for example, read-only passwords). This is needed just
for control of everyday usage of the database. A number of
additional precautions can be taken to further ensure that
data are less vulnerable. Encryption, the coding of data so
that it is unintelligible, is being used in the SPLICE
system as a further precaution for the protection of data.
This is not effective, however, unless the programs which
encrypt data are protected from would-be infiltrators.
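
The minimum level of access control described above, account numbers with accompanying passwords that carry specific rights such as read-only, might be sketched in Python as below. The class `AccessControl`, its method names and the operation names are hypothetical. Storing a salted hash instead of the password itself is a present-day practice assumed for the sketch; it is not taken from the SPLICE documents.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for the password hash


class AccessControl:
    """Maps account numbers to a password hash and a set of allowed operations."""

    def __init__(self) -> None:
        self._accounts = {}  # account -> (salt, digest, rights)

    def add_account(self, account: str, password: str, rights: set) -> None:
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
        self._accounts[account] = (salt, digest, frozenset(rights))

    def is_allowed(self, account: str, password: str, operation: str) -> bool:
        """True only if the password matches and the operation is permitted."""
        entry = self._accounts.get(account)
        if entry is None:
            return False
        salt, digest, rights = entry
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
        # compare_digest avoids leaking information through timing.
        return hmac.compare_digest(candidate, digest) and operation in rights
```

An account created with the rights `{"read"}` behaves as a read-only password: queries are allowed, while an `"update"` request is refused.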
2. Concurrent Updates

A problem associated with all database systems is
that of concurrent updates. This is the problem which may
occur in any system which has more than one user updating
the same file at the same time. In a distributed system,
where there are concurrent executions, there is always the
chance of there being problems with livelock and deadlock.
Livelock is a situation where one or more processes may
wait indefinitely on a locked item because the lock is
repeatedly granted to other processes. Using a first-come-
first-served strategy of locking items usually resolves this
problem. Afterwards, all locks are released. Deadlock is a
situation in which each member of a set of transactions is
waiting to lock an item already locked by another
transaction in the set. Since each transaction of the set is
waiting, it cannot release the item that another transaction
needs. Therefore all of them wait indefinitely. Detection
and prevention are two ways of handling deadlocks.

Locks should be placed on all items to be updated
before updates are done in order to solve the concurrent
update problem. Certain transactions prevent other
transactions from accessing a data item until that item is
unlocked. Therefore, other users cannot access that portion
of the database. This is particularly important in
distributed environments because of the various locations of data.
If the locking approach is used, the items to be locked
should be fairly large, maybe even entire relations. This
would reduce some of the costs associated with the locking
mechanism.
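
The locking ideas discussed above can be sketched in Python. The class `LockTable` below is hypothetical: it grants each lock first-come-first-served, which addresses livelock, and it detects deadlock by looking for a cycle among transactions waiting on one another. As a simplifying assumption, a waiting transaction is taken to wait only for the current holder of the item, and lockable items are identified by name (for example, an entire relation).

```python
from collections import deque


class LockTable:
    """Exclusive locks with FIFO waiting queues and deadlock detection."""

    def __init__(self) -> None:
        self._holder = {}   # item -> transaction holding the lock
        self._waiting = {}  # item -> deque of waiting transactions, oldest first

    def request(self, txn: str, item: str) -> bool:
        """Grant the lock if it is free; otherwise queue txn. True if granted."""
        if item not in self._holder:
            self._holder[item] = txn
            return True
        if self._holder[item] == txn:
            return True
        queue = self._waiting.setdefault(item, deque())
        if txn not in queue:
            queue.append(txn)
        return False

    def release(self, txn: str, item: str):
        """Release the lock and pass it to the longest-waiting transaction."""
        if self._holder.get(item) != txn:
            raise ValueError(f"{txn} does not hold the lock on {item}")
        queue = self._waiting.get(item)
        if queue:
            next_txn = queue.popleft()
            self._holder[item] = next_txn
            return next_txn
        del self._holder[item]
        return None

    def deadlocked(self) -> bool:
        """True if some set of transactions is waiting in a cycle."""
        waits_for = {}
        for item, queue in self._waiting.items():
            for txn in queue:
                waits_for.setdefault(txn, set()).add(self._holder[item])
        visiting, done = set(), set()

        def has_cycle(txn: str) -> bool:
            if txn in done:
                return False
            if txn in visiting:
                return True
            visiting.add(txn)
            found = any(has_cycle(nxt) for nxt in waits_for.get(txn, ()))
            visiting.discard(txn)
            done.add(txn)
            return found

        return any(has_cycle(txn) for txn in list(waits_for))
```

When `deadlocked()` reports a cycle, one of the waiting transactions would have to be aborted and its locks released; choosing which one is a policy decision outside this sketch.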

When viewed over the entire SPLICE system, databases
are obviously geographically distributed. For the purposes
of maintaining adequate control over cataloging files and
maintaining the integrity of related files in the database
(synchronization of updating procedures), the database
functions are centralized within each LAN. Other than the
fact that some files will remain on the Burroughs Hosts and
some files will migrate to an interactive DBM, there is no
reason to provide for the distribution of databases within a
LAN [Ref. 3].

3. System Crashes

One problem which needs consideration is how to
handle or prevent system crashes. When a computer fails, all
or part of a transaction may have been completed. There is
no sure way to tell exactly what has happened. For that
reason, it is essential that backup copies of files be made
periodically. For larger databases, copies can be made less
frequently than for smaller ones because of the amount of time
it takes to copy large databases. During recovery after
a failure, a determination has to be made as to which
transactions should be repeated. For this reason, a log or
journal is needed which will contain information concerning
all changes to the database since the last backup copy was
made. Recovery from failure is particularly a problem with
online systems. There may be no copies of the transactions.
Therefore, it may be very difficult to recreate the
transactions. The reprocessing may not repeat the exact processing
sequence.

Finally, consideration should be given to preventing
and handling system crashes and to dealing with whether a
transaction has been "committed" or not. The commit point
is the point at which a transaction is considered complete.
When transactions must be "redone" or "undone", a log or
journal which contains those which have committed will assist
in reconstructing the database. According to Ullman [Ref. 6],
one could define a two-phase commit policy which would operate
as follows:

1. A transaction cannot write into the database
until it has committed.

2. A transaction cannot commit until it has
recorded all its changes to items in the
journal.

All unlocking is done after the committed transactions have
occurred. Uncommitted transactions cannot be input to the
database. If a crash occurs, destroyed data can be redone
using the backup copy of committed transactions. A message
could be sent to the user warning him about transactions
which have not been completed. Also, after a crash, locks
may still be in place on data items; a recovery routine will
have to remove these locks.
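
The two rules of the commit policy, and the recovery routine that follows a crash, can be sketched as follows. This is a minimal Python illustration under our own assumptions; the Transaction and Database classes and their method names are hypothetical and are not taken from reference 6.

```python
class Transaction:
    def __init__(self, name):
        self.name = name
        self.pending = {}        # changes not yet visible in the database
        self.committed = False

    def change(self, key, value):
        self.pending[key] = value


class Database:
    def __init__(self):
        self.items = {}
        self.journal = []        # (transaction name, changes) of committed work
        self.locks = {}          # key -> name of the transaction holding it

    def commit(self, txn):
        # Rule 2: all changes go to the journal before the commit point.
        self.journal.append((txn.name, dict(txn.pending)))
        txn.committed = True

    def write(self, txn):
        # Rule 1: nothing reaches the database before the commit point.
        if not txn.committed:
            raise RuntimeError(f"{txn.name} has not committed")
        self.items.update(txn.pending)

    def recover(self, backup):
        """After a crash: redo committed work and remove leftover locks."""
        self.items = dict(backup)
        for _name, changes in self.journal:
            self.items.update(changes)
        self.locks.clear()


db = Database()
t1 = Transaction("T1")
t1.change("part-001", 35)
t2 = Transaction("T2")
t2.change("part-002", 12)
db.locks = {"part-001": "T1", "part-002": "T2"}

db.commit(t1)                         # T1 reaches its commit point; T2 does not
db.recover(backup={"part-001": 40})   # crash before either transaction writes
```

After recovery only the work of T1 is present, and the user who submitted T2 would be the one to receive the warning message about an uncompleted transaction.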


4.  Data Location

(2).  Centralized or Partitioned.

It is not easy to know exactly where to
put global data. The first consideration is whether it
should be centralized or partitioned. If it is centralized,
one computer in the distributed network stores the data. A
request is sent to this computer when global data is needed.
Partitioned data, however, is spread over several computers.
If data is needed, its location first has to be determined
before the data can be accessed.

The advantage of centralized data is that
every node knows the location of the data. This is the
approach which will be used in the SPLICE system for the
Database Management Module in each LAN. This makes the system
simple. The concurrent update problem would be handled by
only one processor. However, if global data is needed, it is
possible that a performance bottleneck will develop when
accessing data from one computer. Partitioned data would not
cause this kind of problem.

Reliability also has to be considered.
With centralized data, if the one database computer fails,
all nodes in the local network would have to discontinue
processing. When global data is needed in a partitioned data
system and one node fails, all other nodes in the network
could continue processing. Unfortunately, updates of
partitioned data are much harder to control [Ref. 8]. A
centralized system can be configured to continue operation,
in the case of failure, by using redundant hardware and
software, combined with mirrored disk operation and
checkpointing. This is the way SPLICE will handle this problem.

(3).  Replicated or Nonreplicated.

There should be a number of copies of the
data stored in the network. To replicate centralized global
data, the entire collection of global data is stored at
several locations in the network. When this is done, the
nodes need only keep lists of the computers having the
replicated data. They do not need directories that show the
locations of particular kinds of data; all of the data is
located at each of the nodes. Also, replicating global data
eliminates the bottleneck and reliability problems discussed
above, but introduces the problem of concurrent update
control [Ref. 8].

When data is partitioned or replicated,
each node must have access to a directory that gives the
location or locations of each type of data [Ref. 8].
Therefore, when a user tries to access global data, the
operating system or DBMS calls upon the directory to find
the location of that data.
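
A directory of this kind can be sketched as a simple table. The Python below is an illustration only; the node names, the data types and the locate function are hypothetical, and a real directory would itself have to be stored and kept current somewhere in the network.

```python
# Directory: type of data -> computers that hold it (hypothetical names).
directory = {
    "stock_levels": ["node_a"],              # partitioned: one location
    "requisitions": ["node_b"],              # partitioned: one location
    "part_catalog": ["node_a", "node_c"],    # replicated: several locations
}


def locate(data_type, available):
    """Return the first listed node for data_type that is still running."""
    for node in directory[data_type]:
        if node in available:
            return node
    raise LookupError(f"no available copy of {data_type}")


# node_a has failed; the replicated catalog is still reachable on node_c.
catalog_node = locate("part_catalog", available={"node_b", "node_c"})
```

With node_a down, a request for "stock_levels" would raise LookupError, which reflects the reliability cost of partitioned data that is not also replicated.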

Finally,  Kroenke  states  that  the  greatest 
amount of flexibility can be found with the replicated,
partitioned  storage  of  data  across  a  network.  However, 
control  is  much  more  difficult.  More  complexity  is  also 
added  to  the  operation  of  the  network. 

When updating data in a system that has
replicated data, there is an issue which should be
considered. In a centralized, nonreplicated data system, an
application program can lock records before they are used.
The lock only involves data in a single computer, whereas with
replicated data a lock would have to be placed on all
computers  with  the  global  data  item.  The  problem  with  this 
is  that  if  locks  are  applied  simultaneously,  one  user  could 
possibly  have  the  record  locked  on  one  computer  while 
another  user  has  the  record  locked  on  another  computer. 
Neither  user  has  complete  control  because  both  locks  are  in 
place  for  two  different  users.  The  system  has  to  resolve 
this  conflict.  This  process  can  waste  a  lot  of  time. 
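
The conflict can be reproduced with a small, deterministic Python sketch. The Replica class and the user names are hypothetical. The last step, in which one user backs off and the other retries, is only one possible way for a system to resolve the conflict and is our assumption; the sources cited here do not prescribe it.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.locks = {}                  # record id -> user holding the lock

    def try_lock(self, record, user):
        """Grant the lock if it is free or already held by this user."""
        holder = self.locks.setdefault(record, user)
        return holder == user

    def unlock(self, record, user):
        if self.locks.get(record) == user:
            del self.locks[record]


def holders(replicas, record):
    """Report which user holds the record on each computer."""
    return {r.name: r.locks.get(record) for r in replicas}


c1, c2 = Replica("computer_1"), Replica("computer_2")

# Both users begin at the same time, each starting with a different copy.
a_first = c1.try_lock("part-001", "user_a")
b_first = c2.try_lock("part-001", "user_b")
a_second = c2.try_lock("part-001", "user_a")   # refused: user_b holds it
b_second = c1.try_lock("part-001", "user_b")   # refused: user_a holds it

conflict = holders([c1, c2], "part-001")

# Assumed resolution: user_b gives up its lock, user_a retries.
c2.unlock("part-001", "user_b")
a_retry = c2.try_lock("part-001", "user_a")
```

Until one user backs off, neither holds the record on every computer, and the time spent detecting and undoing this state is the waste referred to above.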

Even nonreplicated, partitioned data can
cause problems. Suppose a transaction is to be applied to
several different records. There is no problem if this update
is done on one computer. If some of the data are on different
computers, however, there could be two users locking each
other out.

These are concurrency problems. The
resolution of these problems is an active research area. Even
though the concurrent update problem is of concern, the
scope of this thesis does not include the resolution of such
problems.


IV.  THE DATABASE MACHINE AS AN ALTERNATIVE DESIGN


A.  GROWTH POTENTIAL OF SPLICE PROJECTS

The database computer (DBC) or the database machine
(DBM), as it is sometimes called, should be considered as
one of the hardware alternatives for implementing the
Database Management System (DBMS). By offloading DBMS
functions from the host application computers to a DBM,
application processing speed is increased and SPLICE
application growth can be more readily absorbed.

B.  CONVENTIONAL COMPUTERS

1.  Designed Using Large, Complex Software

Large databases of the future may need a DBM.
Because the database machine is a special purpose machine
which can handle DBMS functions efficiently, large databases
of the future will more than likely be managed by database
machines as opposed to conventional database management
software.

In  the  following  paragraphs,  DBMs  will  be  examined. 
Actual  models  of  DBMs  will  be  presented  and  the  different 
approaches to DBM architectures will be discussed.

2.  Designed to Access Data by Physical Address

Conventional computer architectures and applications
have been designed to refer to physical addresses in order
to address data. Despite the increasing number of
applications which are centered around information storage
and retrieval, the conventional systems are unable to
retrieve information by content. This inability to handle
content addressing has led to interest in computer
architectures which are more efficient in information storage
and retrieval applications. One of the solutions to this
problem is the database computer. The database computer can
be incorporated into a system in one of four ways:

1.  Back-end  processor  for  a  host 

2.  Intelligent  peripheral  control  unit 

3.  Storage  hierarchy 

4.  Network  node. 

These approaches are independent of each other, which may
suggest that more than one approach could be included in a
system's architecture.

C.  DATABASE COMPUTER APPROACHES

1.  Back-end Processor

The back-end processor is a general purpose computer
which, together with the host, is thought of as a
master-slave configuration. High level access requests are
passed to the back-end processor by the host computer. Access
validation, management of storage, update lockout, response
formatting, and I/O operations are all performed by the
back-end processor. After the back-end processor is finished,
it passes the response back to the host.

2.  Intelligent Peripheral

The intelligent peripheral control unit works in
conjunction with a mass storage device. Highly repetitive
data accesses are moved to a mass storage controller to
avoid high overhead on the host hardware and software.
Functions like device scheduling, head positioning, data
recovery, searching, sorting and error correction are
performed by the intelligent peripheral control unit.
Functions like sequential associative access and parallel
read (on a disk) can also be implemented. (Note: An
associative computer architecture allows data to be accessed
directly by value without physical addresses.)
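
The difference between the two styles of access can be shown with a short Python sketch. The records and the attribute names (part, bin, qty) are hypothetical. The sequential scan in by_content is only a software stand-in; an associative device would perform the comparisons in hardware.

```python
records = [
    {"part": "bolt", "bin": "A1", "qty": 40},
    {"part": "nut", "bin": "A2", "qty": 0},
    {"part": "washer", "bin": "B7", "qty": 0},
]


def by_address(position):
    """Conventional access: the caller must know where the record is."""
    return records[position]


def by_content(**wanted):
    """Associative access: the caller states values, not locations."""
    return [
        r for r in records
        if all(r.get(name) == value for name, value in wanted.items())
    ]


second_record = by_address(1)
out_of_stock = by_content(qty=0)
```

The caller of by_content never supplies a position, which is the property that makes content addressing attractive for storage and retrieval applications.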

3.  Storage Hierarchy

The storage hierarchy is similar to the cache memory
approach in that both are concerned with the locality of
data. The locality of access is such that data already used,
or data near other data which has been used, is very likely
to be accessed in the near future [Ref. 9]. This
characteristic can be used to speed up the access of data in
a database. The database cache could be inserted in the
system between main storage and disk [Ref. 9]. If implemented
by a sequential access device, such as CCD or bubble storage,
access time would be less than 1 msec if data is located in
the cache. Otherwise, the request would take longer. The
fact that data is managed on a least-recently-used basis
ensures that most of the active data resides in the fast
access CCD or bubble storage.
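
Least-recently-used management of such a cache can be sketched as follows. This Python illustration is ours and does not describe any particular product; the DatabaseCache class is hypothetical, and an in-memory dictionary stands in for the disk.

```python
from collections import OrderedDict


class DatabaseCache:
    """Small fast store placed between main storage and disk."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk
        self.blocks = OrderedDict()     # oldest use first, newest use last
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)   # mark as most recently used
            return self.blocks[block_id]
        self.misses += 1
        data = self.disk[block_id]              # slow path: go to disk
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict least recently used
        return data


disk = {n: f"block-{n}" for n in range(5)}
cache = DatabaseCache(capacity=2, disk=disk)
for n in [0, 1, 0, 2, 0, 1]:
    cache.read(n)
```

Block 0, which is used repeatedly, stays in the cache for the whole run, while blocks 1 and 2 displace each other. This is the locality effect the storage hierarchy relies on.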

4.  Network Node

According to reference 9, the "network node", yet
another approach to a database computer, is a general
purpose computer which communicates with several other nodes
in the system, most frequently using a data communications
protocol and serial channels, but possibly using I/O
channels. The benefit of this configuration is that several
nodes (hosts) can access a single shared database, thus
avoiding replication of the data. It is implemented on a
general purpose system, a host and back-end processor, or a
host and intelligent control unit.

D.  DATABASE TECHNOLOGY

In the past decade, the Database Management System has
become more popular. There are many gains to be made through
the use of database technology. The Database Management
System relieves the application program of many tasks. Yet
the Database Management System has had its drawbacks.
Typically, the software-laden database management system has
been large in size and complex in structure, which not only
overtaxes the host hardware, but also stresses the host
operating system [Ref. 10]. Larger and more sophisticated
applications began to demand more speed, capacity and
retrieval flexibility from general purpose computers.
Therefore, it is not surprising that a way to alleviate some
of the demands on general purpose computers has been sought.
As technology improved, it became clear that a more efficient
form of back-end processor could be developed as a "database
machine", one specially designed to manipulate and access
data with more flexibility than conventional computers
running general purpose software [Ref. 11].

E.  INITIAL RESEARCH OF DBMS

Research on the DBM concept started at Bell Labs in the
early seventies as work on a "back-end" DBMS (Database
Management System) using general purpose computers in a
dedicated environment [Ref. 11]. As a back-end machine, the
DBC (Database Computer) attempts to achieve high performance
and low cost [Ref. 10]. Originally, the five goals of the
DBC were:

1.  To design a machine with the capability of
handling a very large online database of
10^10 bytes or greater (The DBM is usually
not cost effective on a smaller database)

2.  To build a database computer today (1979)

3.  To have the DBC compete in a favorable
manner with the existing DBMS as far as
system throughput and cost of database
storage were concerned

4.  To make security an integral part of the
DBC design

5.  To provide a repertoire of very high-level
commands in order to interface sufficiently
with front-end computers and support
database management applications.

F.  MODELS OF DBMS

1.  IDM Family

a.  IDM 500/1 and IDM 500/2

Britton Lee, Inc., a company in Los Gatos,
California, has developed a number of relational database
machines. Among those developed are the IDM 500/1 and IDM
500/2, relational intelligent database machines, as well as
the IDM System 300/600, a relational database management
system.

The IDM 500/1 was the first high-performance
Intelligent Database Machine (IDM) on the market. It serves
as an auxiliary processor to one or more host computers and
is driven by a high-level query language which is resident
in the host. It handles the relational database tasks and
manages dedicated database disks. The IDM 500/1 has room for
expansion for medium to large database applications and is
programmed to optimally perform retrieving, updating,
sorting, etc. The IDM 500/2 is basically the same machine,
but it is a high-end model with a custom-designed 10 MIPS
database accelerator. It handles transactions 2 to 10 times
faster.
A full complement of database management functions performed
by both members of the IDM family is listed below:

1.  Basic Commands - Host commands, formatted to IDM
specifications, which create and destroy
databases, relations and indices to data.
Appending, retrieving and replacing
data in relations are also handled.

2.  Integrated Data Dictionary - Data dictionary
relations are automatically maintained
on information about data.

3.  Transaction Management - Ensures that
user-specified transactions are fully
completed or backed out in the event of
a failure.

4.  Concurrency Control - Allows multiple users
to safely access the same database
simultaneously.

5.  Access Control - Protects data by using such
features as deny/permit access privileges
and read/write locking of shared data.

6.  Audit Logging - Maintains checkpointed audit
or transaction logs for auditing, backup
and recovery.

7.  Backup and Recovery - Online dump facility
supports backup of disks, databases and
transaction logs to disk, tape and the
host. The database is recovered via a
load and transactions are rolled forward.

8.  Random Access Files

9.  Stored Commands - Minimizes execution time.
User-defined stored commands are featured.

All these functions can be seen in figure 4.1. The
architecture is modularized and expandable. In figure 4.2 the
data processor, main memory, disk controller and tape
controllers (optional) are shown.
Host interfacing is provided by both parallel IEEE-488 and
serial RS232C host interface modules (up to 8 hosts). The
system also has extra expansion slots for future growth. The
IDM 500/2 has an extra function, the Database Accelerator.
It is custom-designed and has an instruction set which
optimizes relational database processing.

b.  IDM  300/600 

The IDM System 300 and IDM System 600 are
complete relational database management systems for DEC VAX
users running VMS or UNIX, and for PDP-11 UNIX users
[Ref. 12]. Both combine an Intelligent Database Machine (IDM)
from Britton Lee with end-user software tools in the host for
database applications. Included in the software are: 1) data
entry facilities, 2) an ad hoc online query language, 3) a
report writer, 4) program language interfaces for FORTRAN,
COBOL and C, and 5) a full complement of Database
Administration utilities. The IDM System 300/600
architecture can be seen in figure 4.3.

The functions performed by the IDM System
300/600 are the same as those listed above for the IDM 500/1
and IDM 500/2. There is an additional function, however:
Multiple Host Support. The system can be expanded
to allow several hosts to access its databases. This access
is provided by the IDM System 300/600 UNIBUS Interface
Packages. Figure 4.4 shows the system architecture. Figure
4.5 shows a summary of the maximum IDM capacities.

2.  iDBP 86/440

Another consideration is the Database Processor
(iDBP 86/440) by Intel. It is a microprocessor-based
relational database management system. Functionally, it is a
mass storage controller for one or more hosts. Software and
specialized hardware are included in the database management
system design. It is positioned between the host(s) and a
set of dedicated disks. Figure 4.6 shows the system
architecture.

The iDBP provides a database management system
"kernel" which supports relational, hierarchical and network
databases. It also provides concurrency control, security,
integrity and recovery mechanisms for sharing of data. In
addition to traditional, record-oriented files, the iDBP
also manages unstructured files which may contain text,
graphics, digitized voice or digitized images [Ref. 13].
The iDBP 86/440 is the first of a family of database
machines by Intel, and it will continue to be enhanced
as the new VLSI component is integrated into it in the
future. Figure 4.7 gives the configurability of the iDBP and
its system capacities.

3.  NOAH

The final database machine to be examined is called
NOAH, produced by HER Systems Inc. NOAH is a relational
database machine which provides a modular architecture.
Figure 4.8 gives NOAH's hardware configuration. Some of
NOAH's software modules reside on processors dedicated to
database management and query language functions while others
reside on the general purpose host. The functions performed
by NOAH are:

1.  Query language (SQL/NOAH)

2.  Integrated Data Dictionary

3.  Security

4.  Recovery

Figure 4.9 gives specifications and configuration
information.

These were, of course, only a few of the database
machines which have been designed and developed in the last
few years.

G.  EXPECTED PERFORMANCE OF DBCS


1.  Advantages of DBCs

A large number of database management functions are
implemented in hardware. Since this is the case, DBCs
are expected to perform quite a bit better than
the computers which provide these functions in software.
Software security enforcement may also be absorbed in
hardware.

An existing database may be supported on the DBC by
converting the database to conform to the DBC representation
of data. This one-time conversion is known as database
transformation [Ref. 10]. The DBC manufacturers claim that
no reprogramming of the database management application is
necessary, unless the user wishes to reformat his data. An
interface will translate the database management calls into
DBC commands. The interface requires a small amount of
software because DBC commands resemble high-level data
languages. This process is known as query translation. The
DBC interface package resides in the front-end computer. The
interface, together with the database computer, replaces a
full-scale software database management system and its
conventional disk storage [Ref. 10]. The application
program, however, is not replaced.

According  to  reference  10, 

It is estimated that in supporting these applications on
the DBC, the database storage requirement is as much as
1.5 or 2 times that in a conventional system. This
excess storage requirement, however, is adequately
offset by one or more orders of magnitude improvement in
the execution time of user transactions. Furthermore,
the storage requirement for the indices decreases by
one or more orders of magnitude. Finally, the size of
the software (i.e. the DBC interface) is expected to be
several orders of magnitude smaller than conventional
database management software.


Today, software for mainframe computers is
handling problems such as recovery from failure, concurrency
control, and integrity validation. If the DBC handles such
problems, it would relieve the mainframe system of many of
the database software functions.

2.  Disadvantages of DBCs

In spite of all the potential benefits provided by
database computers, there are some disadvantages which must
be pointed out. These disadvantages are the following:

1.  DBMs increase system complexity

2.  DBMs make load balancing difficult

3.  DBMs will create training and conversion
requirements for users

When a system is designed, complexity, functionality
and cost must be considered. A decision has to be made as to
which of these issues is most important to an organization.
The back-end processor and its associated software add cost
and complexity to the total computer system [Ref. 14]. The
decision which must be made by the organization, however, is
whether these two added dimensions will pay for themselves
in the future. Also to be considered is the fact that two
hardware and software systems will have to be maintained.

Another consideration is the fact that with
conventional hosts, it is possible to balance the load between
computers. Once a DBM has been acquired this is not
possible, since the DBM is dedicated to one task, database
management. However, it must be mentioned that one of the
main reasons for acquiring a DBC is to offload the DBMS from
the host to the DBC.

Finally,  if  an  organization  has  never  used  a  data¬ 
base  computer  before,  there  will  be  a  considerable  amount  of 
time which will be needed for training and system
conversion. The organization must consider this and incorporate

Figure 4.1  Database Management Functions of the IDM 500.

SUMMARY OF MAXIMUM IDM CAPACITIES


SPECIFICATION                  IDM 200             IDM 500

Base Configuration             5 board set in      5 board set in
                               7 slot chassis      16 slot chassis

Expandable to:
  IDM Memory                   1 Mbyte             6 Mbytes
  Disk storage                 4 SMD disks         16 SMD disks
  Tape Controller              6 transports        9 transports
  I/O Controller               24 devices          64 devices
    (RS-232 serial and/or
    IEEE-488 parallel)
  Database Accelerator         No                  Yes - optional

Relational DBMS Capacity:
  Number of databases          50                  50
  Relations per database       32,000              32,000
  Attributes per relation      250                 250
  Tuples per relation          2 billion           2 billion
  Tuple width                  2,000 bytes         2,000 bytes
  Indices per relation         255                 255
  Attributes per index         15                  15
  Index type                   B*tree              B*tree
  Number of Users              128                 4,096

Figure 4.5  Summary of the Maximum IDM Capacities.




Configurability

System Feature                 Options Available

Memory size                    P40K bytes of RAM
                               1024K bytes of RAM

Host interfaces                One to sixteen serial links (RS232)
(may be intermixed             One to four parallel links (IEEE 488)
per system)                    One or two Ethernet links

Mass storage interfaces        One to sixteen SMD-compatible or
                               Winchester disk drives
                               One to four SMD or Winchester disk
                               controllers
                               One start/stop tape drive

System capacity

Maximum number of files                     32,767
Maximum file size                           263 Mbytes
Maximum number of databases                 235
Maximum number of files/database            255
Maximum number of items/record              127
Maximum number of concurrent sessions       254
Maximum structured record size              9,192 bytes
(equals maximum page size)

Figure 4.7  Configurability of the iDBP and its System Capacities.

Preliminary Specifications

Database Type: Relational
Maximum data capacity: 32 billion bytes
Typical processing rate:
    2-5 transactions per second
    10-25 with Database Accelerator Option
Maximum number of databases per IDM: 50
Maximum number of relations (files) per database: 32,000
Maximum number of domains (fields) per relation: 250
Maximum number of tuples (records) per relation: 2 billion
Maximum tuple width: 2000 bytes
Data types:
    1, 2, 4 byte integers
    1-255 byte variable length character fields
    10?) digit packed decimal
    4, 8 byte floating point
Maximum number of clustering indices per relation: 1
Maximum number of non-clustering indices per relation: 235
Maximum number of domains (keys) per index: 13
Index type: B tree

Configuration Information

Base configuration:

Query Processor
    8 Channel Processor Boards
    Database Processor Software Loader
    SQL/Noah Query Language
    Database Management Utilities
    Interface Software for Supported Host(s)
    16 Slot Chassis, Power Supply and Bottom Plane

Database Processor
    Memory Timing and Control Board
    Memory Storage Board (256K bytes)
    Disc Controller (supports up to 4 Storage Module Drives)
    Serial or Parallel I/O Channel Board
    16-slot Chassis, Power Supply and Bottom Plane

Options:

    Additional Query Channels (Supports up to 12)
    X2 Query Channel Upgrade
    Report Writer, Query Channel (Available 9/83)
    IEEE Host Noah Upgrade
    488 IEEE Internal Noah Upgrade
    Tape Controller and Tape Drive (Supports up to 8 tape drives)
    Database Accelerator (Typically improves performance by a
    factor of 10)
    Memory (256K byte Array Boards) - up to 3 Megabytes of Storage

Physical Size:    52" H X 30" D X 24" W
                  Noah Enclosure 19" W X 22" H X 2* D

Weight:           Noah Enclosure 150 Lbs. Max. 120 Lbs. Avg.
                  Cabinet bay 100 Lbs.
                  IDM 170 Lbs. Max. 150 Lbs. Avg.

Electrical Spec.: 900 W Max. 120 V 60 Hz

Figure 4.9  Specifications and Configuration Information.


V.  CONCLUSIONS


Distributed processing and computer networks are
enabling computers to use programs and data stored in
computers at different locations. The SPLICE project is one
of the projects in which these advances will be
incorporated. Growth in communications, minicomputers and
microcomputers is making the use of distributed processing
possible. Many of the jobs which used to be done on large,
heavily shared computers can now be done on stand-alone
minicomputers or microcomputers [Ref. 15]. The advantages
of distributed information have received increasing
attention. The recognition of these advantages has provided
impetus to work on distributed systems. However, the
complexities of such systems must be investigated and
resolved if these systems are to work effectively.

This thesis covered only a small portion of SPLICE, the
Database Management Module. A conceptual design of the
database for SPLICE was given. Since the database will
contain data on supply parts, the attributes
given in each relation were carefully chosen to reflect
information needed in an inventory system. The relations
which were designed are long. Shorter relations can be
joined to produce the same attributes. Joins, however, can
be time consuming. Therefore, the longer relations, which do
not take as much time to provide the attributes needed to
answer a query, were utilized.

Among the problems of a distributed system are access
control and security. The use of encryption devices, user
accounts and passwords is the minimum in security features
which must be incorporated into SPLICE.

Concurrent updates in a distributed environment can
cause a problem. However, deadlocks and livelocks can
possibly be handled by using locking strategies. Besides the
concurrent update problem, recovery from crashes must be
considered. When a system fails, the number of transactions
which were completed is unknown. There must be some type of
logs or journals which will help to determine not only which
transactions were completed, but also the source of the
failure. All of this has to be considered in light of the
fact that other systems in a distributed network must
continue to function regardless of the fact that one
computer has failed.

Locking data is another problem within a local area
network. Whether data is local or global, centralized or
partitioned, will determine the magnitude of the problem.
Accessing data as well as updating data can be a big problem,
especially if locks must be placed on the data.

Finally, alternative hardware considerations for SPLICE
were discussed. The database computers were proposed as an
alternative to conventional computers. The different
approaches to database computers were given (back-end
processor, intelligent peripheral, storage hierarchy,
network node) along with a brief description of each. Also,
several models of database computers were presented with
their functions and architectural configurations. From the
information presented on these database computers, all of
which were relational, we found that they have features
which could be very useful for SPLICE. The fact that they
are able to directly access the content of a data item, as
well as being able to offload database management systems,
makes them very attractive alternatives. Some of the database
computers are even able to interface with existing
databases. We must, however, take into account the
disadvantages of database computers. All things considered,
database computers seem to be a very viable alternative for
SPLICE.